We sequenced 2650 samples using dual-index, 100 bp paired-end reads.
After further trimming for PHRED quality score and Illumina adapter content, and alignment against the mouse genome GRCm38 (also called mm10):
##
## 1 2 3 4 5 6 7 8 9 10
## 263 429 573 610 250 410 66 10 38 1
We select the samples that pass all 19 criteria, using the thresholds specified in the table below:
| Criteria | Selection |
|---|---|
| #Genes | >= 1000 & <= 6500 |
| PercentMito | >= 0 & <= 0.006 |
| PercentERCC | >= 0 & <= 0.011 |
| adapter_content | PASS |
| Sequences.flagged.as.poor.quality | 0 |
| sequence_duplication_levels | PASS |
| avg_input_read_length | >= 180 & <= 200 |
| per_base_sequence_quality | PASS |
| sequence_length_distribution | PASS |
| basic_statistics | PASS |
| per_sequence_gc_content | PASS |
| total_reads | >= 20000 & <= 4e+06 |
| per_base_n_content | PASS |
| overrepresented_sequences | PASS |
| per_sequence_quality_scores | PASS |
| uniquely_mapped_percent | >= 68 & <= 100 |
| unmapped_tooshort_percent | >= 0 & <= 17 |
| mismatch_rate | >= 0.15 & <= 0.5 |
| multimapped_percent | >= 2.3 & <= 7.7 |
The quantitative criteria are shown here together with the chosen thresholds. Colour indicates tissue (GM or WM).
Of the 2650 single cells sequenced, 2538 passed the quantitative QC (seen above).
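
To make the selection rule concrete, here is a minimal sketch of how the 19 criteria in the table could be applied to one sample. It is an illustration only, not the code used for this analysis: the names `RANGES`, `REQUIRED`, `failed_criteria`, `passes_qc` and `example_sample` are hypothetical, the per-sample metrics are assumed to be available as a dictionary keyed by the criterion names in the table, and the example values are invented.

```python
# Hypothetical sketch of the QC selection; not the original analysis code.

# Quantitative criteria: metric -> (lower bound, upper bound), both inclusive.
RANGES = {
    "#Genes": (1000, 6500),
    "PercentMito": (0, 0.006),
    "PercentERCC": (0, 0.011),
    "avg_input_read_length": (180, 200),
    "total_reads": (20000, 4e6),
    "uniquely_mapped_percent": (68, 100),
    "unmapped_tooshort_percent": (0, 17),
    "mismatch_rate": (0.15, 0.5),
    "multimapped_percent": (2.3, 7.7),
}

# Categorical criteria: metric -> value the sample must have.
REQUIRED = {
    "adapter_content": "PASS",
    "Sequences.flagged.as.poor.quality": 0,
    "sequence_duplication_levels": "PASS",
    "per_base_sequence_quality": "PASS",
    "sequence_length_distribution": "PASS",
    "basic_statistics": "PASS",
    "per_sequence_gc_content": "PASS",
    "per_base_n_content": "PASS",
    "overrepresented_sequences": "PASS",
    "per_sequence_quality_scores": "PASS",
}


def failed_criteria(sample):
    """Return the names of the criteria that a sample (a dict) fails."""
    failed = []
    for name, (low, high) in RANGES.items():
        value = sample.get(name)
        # A missing metric is treated as a failure (an assumption of this sketch).
        if value is None or not (low <= value <= high):
            failed.append(name)
    for name, expected in REQUIRED.items():
        if sample.get(name) != expected:
            failed.append(name)
    return failed


def passes_qc(sample):
    """A sample is kept only if it fails none of the 19 criteria."""
    return not failed_criteria(sample)


# Invented values for illustration; they do not come from the dataset.
example_sample = {
    "#Genes": 3200,
    "PercentMito": 0.002,
    "PercentERCC": 0.005,
    "avg_input_read_length": 198,
    "total_reads": 850000,
    "uniquely_mapped_percent": 85.0,
    "unmapped_tooshort_percent": 6.0,
    "mismatch_rate": 0.3,
    "multimapped_percent": 4.5,
    **REQUIRED,
}
```

Calling `passes_qc(example_sample)` returns `True` for these invented values, and `failed_criteria` lists which thresholds a rejected sample violates, which can help when reviewing why cells were dropped.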